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Abstract 

The recursive least-squares (RLS) algorithm has well-documented merits for reducing complexity and 
storage requirements, when it comes to online estimation of stationary signals as well as for tracking slowly- 
varying nonstationary processes. In this paper, a distributed recursive least-squares (D-RLS) algorithm is 
developed for cooperative estimation using ad hoc wireless sensor networks. Distributed iterations are 
obtained by minimizing a separable reformulation of the exponentially-weighted least-squares cost, using 
the alternating-minimization algorithm. Sensors carry out reduced-complexity tasks locally, and exchange 
messages with one-hop neighbors to consent on the network-wide estimates adaptively. A steady-state 
mean-square error (MSE) performance analysis of D-RLS is conducted, by studying a stochastically-driven 
'averaged' system that approximates the D-RLS dynamics asymptotically in time. For sensor observations 
that are linearly related to the time-invariant parameter vector sought, the simplifying independence setting 
assumptions facilitate deriving accurate closed-form expressions for the MSE steady-state values. The 
problems of mean- and MSE-sense stability of D-RLS are also investigated, and easily-checkable sufficient 
conditions are derived under which a steady-state is attained. Without resorting to diminishing step-sizes 
which compromise the tracking ability of D-RLS, stability ensures that per sensor estimates hover inside 
a ball of finite radius centered at the true parameter vector, with high-probability, even when inter-sensor 
communication links are noisy. Interestingly, computer simulations demonstrate that the theoretical findings 
are accurate also in the pragmatic settings whereby sensors acquire temporally-correlated data. 
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I. Introduction 

Wireless sensor networks (WSNs), whereby large numbers of inexpensive sensors with constrained 
resources cooperate to achieve a common goal, constitute a promising technology for applications as 
diverse and crucial as environmental monitoring, process control and fault diagnosis for the industry, and 
protection of critical infrastructure including the smart grid, just to name a few. Emergent WSNs have 
created renewed interest also in the field of distributed computing, calling for collaborative solutions that 
enable low-cost estimation of stationary signals as well as reduced-complexity tracking of nonstationary 
processes; see e.g., [22], [33]. 

In this paper, a distributed recursive least-squares (D-RLS) algorithm is developed for estimation and 
tracking using ad hoc WSNs with noisy links, and analyzed in terms of its stability and mean-square error 
(MSE) steady-state performance. Ad hoc WSNs lack a central processing unit, and accordingly D-RLS 
performs in-network processing of the (spatially) distributed sensor observations. In words, a two-step 
iterative process takes place towards consenting on the desired global exponentially-weighted least-squares 
estimator (EWLSE): sensors perform simple local tasks to refine their current estimates, and exchange 
messages with one-hop neighbors over noisy communication channels. New sensor data acquired in real 
time enrich the estimation process and learn the unknown statistics 'on-the-fly'. In addition, the exponential 
weighting effected through a forgetting factor endows D-RLS with tracking capabilities. This is desirable 
in a constantly changing environment, within which WSNs are envisioned to operate. 

A. Prior art on distributed adaptive estimation 

Unique challenges arising with WSNs dictate that often times sensors need to perform estimation in a 
constantly changing environment without having available a (statistical) model for the underlying processes 
of interest. This has motivated the development of distributed adaptive estimation schemes, generalizing 
the notion of adaptive filtering to a setup involving networked sensing/processing devices [3, Sectionl-B]. 

The incremental (I-) RLS algorithm in [24] is one of the first such approaches, which sequentially 
incorporates new sensor data while performing least-squares estimation. If one can afford maintaining 
a so-termed Hamiltonian cyclic path across sensors, then I-RLS yields the centralized EWLS benchmark 
estimate. Reducing the communication cost at a modest price in terms of estimation performance, an I-RLS 
variant was also put forth in [24]; but the NP-hard challenge of determining a Hamiltonian cycle in large- 
size WSNs remains [18]. Without topological constraints and increasing the degree of collaboration among 
sensors, a diffusion RLS algorithm was proposed in [3]. In addition to local estimates, sensors continuously 
diffuse raw sensor observations and regression vectors per neighborhood. This facilitates percolating new 
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data across the WSN, but estimation performance is degraded in the presence of communication noise. When 
both the sensor measurements and regression vectors are corrupted by additive (colored) noise, the diffusion- 
based RLS algorithm of [1] exploits sensor cooperation to reduce bias in the EWLSE. All [3], [1] and [24] 
include steady-state MSE performance analysis under the independence setting assumptions [23, p. 448]. 
Distributed least mean-squares (LMS) counterparts are also available, trading off computational complexity 
for estimation performance; for noteworthy representatives see [7], [14], [26], and references therein. Recent 
studies have also considered more elaborate sensor processing strategies including projections [8], [12], 
adaptive combination weights [30], or even sensor hierarchies [4], [26], and mobility [32]. 

Several distributed (adaptive) estimation algorithms are rooted on iterative optimization methods, which 
capitalize upon the separable structure of the cost defining the desired estimator. The sample mean estimator 
was formulated in [20] as an optimization problem, and was solved in a distributed fashion using a 
primal dual approach; see, e.g., [2]. Similarly, the incremental schemes in e.g., [7], [19], [21], [24] are all 
based on incremental (sub)gradient methods [17]. Even the diffusion LMS algorithm of [14] has been 
recently shown related to incremental strategies, when these are adopted to optimize an approximate 
reformulation of the LMS cost [5]. Building on the framework introduced by [27], the D-LMS and D-RLS 
algorithms in [15], [16], [26] are obtained upon recasting the respective decentralized estimation problems 
as multiple equivalent constrained subproblems. The resulting minimization subtasks are shown to be highly 
paralellizable across sensors, when carried out using the alternating-direction method of multipliers (AD- 
MoM) [2]. Much related to the AD-MoM is the alternating minimization algorithm (AMA) [31], used here 
to develop a novel D-RLS algorithm offering reduced complexity when compared to its counterpart of [15]. 

B. Contributions and paper outline 

The present paper develops a fully distributed (D-) RLS type of algorithm, which performs in-network, 
adaptive LS estimation. D-RLS is applicable to general ad hoc WSNs that are challenged by additive 
communication noise, and may lack a Hamiltonian cycle altogether. Different from the distributed Kalman 
trackers of e.g., [6], [22], the universality of the LS principle broadens the applicability of D-RLS to a wide 
class of distributed adaptive estimation tasks, since it requires no knowledge of the underlying state space 
model. The algorithm is developed by reformulating the EWLSE into an equivalent constrained form [27], 
which can be minimized in a distributed fashion by capitalizing on the separable structure of the augmented 
Lagrangian using the AMA solver in [31] (Section II). From an algorithmic standpoint, the novel distributed 
iterations here offer two extra features relative to the AD-MoM-based D-RLS variants in [15], [25]. First, 
as discussed in Section II-B the per sensor computational complexity is markedly reduced, since there is no 
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need to explicitly carry out a matrix inversion per iteration as in [15]. Second, the approach here bypasses 
the need of the so-termed bridge sensors [25]. As a result, a fully distributed algorithm is obtained whereby 
all sensors perform the same tasks in a more efficient manner, without introducing hierarchies that may 
require intricate recovery protocols to cope with sensor failures. 

Another contribution of the present paper pertains to a detailed stability and MSE steady-state per- 
formance analysis for D-RLS (Section IV). These theoretical results were lacking in the algorithmic pa- 
pers [15], [25], where claims were only supported via computer simulations. Evaluating the performance of 
(centralized) adaptive filters is a challenging problem in its own right; prior art is surveyed in e.g., [28], [29, 
pg. 120], [23, pg. 357], and the extensive list of references therein. On top of that, a WSN setting introduces 
unique challenges in the analysis such as space-time sensor data and multiple sources of additive noise, 
a consequence of imperfect sensors and communication links. The approach pursued here capitalizes on 
an 'averaged' error-form representation of the local recursions comprising D-RLS, as a global dynamical 
system described by a stochastic difference-equation derived in Section III-B. The covariance matrix of 
the resulting state is then shown to encompass all the information needed to evaluate the relevant global 
and sensor-level performance metrics (Section III-C). For sensor observations that are linearly related 
to the time-invariant parameter vector sought, the simplifying independence setting assumptions [29, pg. 
110], [23, pg. 448] are key enablers towards deriving accurate closed-form expressions for the mean-square 
deviation and excess-MSE steady-state values (Section IV-B). Stability in the mean- and MSE-sense are 
also investigated, revealing easily-checkable sufficient conditions under which a steady-state is attained. 

Numerical tests corroborating the theoretical findings are presented in Section V, while concluding 
remarks and possible directions for future work are given in Section VI. 

Notation: Operators (g), (.) T , (■)', A max (.), tr(.), diag(.), bdiag(.), E[.], vec [.] will denote Kronecker 
product, transposition, matrix pseudo-inverse, spectral radius, matrix trace, diagonal matrix, block diagonal 
matrix, expectation, and matrix vectorization, respectively. For both vectors and matrices, |.| will stand 
for the 2— norm, and |.| for the cardinality of a set or the magnitude of a scalar. Positive definite mattix M 
will be denoted by M y 0. The n x n identity matrix will be represented by I n , while l n will denote the 
n x 1 vector of all ones and l nX m '■= ^-nXm- Similar notation will be adopted for vectors (matrices) of all 
zeros. For matrix M £ M mxn , nullspace(M) := {x £ 1" : Mx = m }. The i-th vector in the canonical 
basis for W 1 will be denoted by h n ^, i = 1, . . . , n. 
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II. Problem Statement and Distributed RLS Algorithm 

Consider a WSN with sensors {1, . . . , J} := J . Only single-hop communications are allowed, i.e., sensor 
j can communicate only with the sensors in its neighborhood Afj C J, having cardinality \J\fj\. Assuming 
that inter-sensor links are symmetric, the WSN is modeled as an undirected connected graph with associated 
graph Laplacian matrix L. Different from [1], [3] and [24], the present network model accounts explicitly 
for non-ideal sensor-to-sensor links. Specifically, signals received at sensor j from sensor i at discrete- 
time instant t are corrupted by a zero-mean additive noise vector 77^- (t), assumed temporally and spatially 
uncorrected. The communication noise covariance matrices are denoted by := E[rfAt)(rfAt)) T \, 
j€j. 

The WSN is deployed to estimate a real signal vector so G M. pxl in a distributed fashion and subject to 
the single-hop communication constraints, by resorting to the LS criterion [23, p. 658]. Per time instant 
t = 0,1,..., each sensor acquires a regression vector hj(i) G W° xl and a scalar observation Xj(t), 
both assumed zero-mean without loss of generality. A similar setting comprising complex-valued data 
was considered in [3] and [24]. Here, the exposition focuses on real-valued quantities for simplicity, but 
extensions to the complex case are straightforward. Given new data sequentially acquired, a pertinent 

approach is to consider the EWLSE [3], [23], [24] 

t j 

s ewls (t) := arg mm £ £ A<~ r [ Xj (r) - hj(r)s] 2 + A*s T * s (1) 

r=0 j=l 

where A G (0, 1] is a forgetting factor, while $0 >~ Opxp is included for regularization. Note that in 
forming the EWLSE at time t, the entire history of data {xj(t), hj(r)}* =0 , V j G J is incorporated in the 
online estimation process. Whenever A < 1, past data are exponentially discarded thus enabling tracking 
of nonstationary processes. Regarding applications, a distributed power spectrum estimation task matching 
the aforementioned problem statement, can be found in [15]. 

To decompose the cost function in (1), in which summands are coupled through the global variable s, 
introduce auxiliary variables {sy}J =1 representing local estimates of so per sensor j. These local estimates 

are utilized to form the separable convex constrained minimization problem 

t J J 

{%(*)}/=! :=arg mm £ £ A^z^r) - hJ(r) Si ] 2 + J^A* £ sJ^qSj, 
l s ib=i r=0 j=1 j=1 

s. t. B j =s r , j G J, j' G Mj. (2) 

From the connectivity of the WSN, (1) and (2) are equivalent in the sense that Sj(t) = s ew i s (t), V j G J 
and t > 0; see also [27]. To arrive at the D-RLS recursions, it is convenient to reparametrize the constraint 
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set (2) in the equivalent form 

Sj-=zj', ay = zPj, and z7- = 7$, j € J, /gA/}, j^f. (3) 
where {zj ,tP- }ji^f., j G J, are auxiliary optimization variables that will be eventually eliminated. 

A. The D-RLS algorithm 

To tackle the constrained minimization problem (2) at time instant t, associate Lagrange multipliers vj 
and trj with the first pair of consensus constraints in (3). Introduce the ordinary Lagrangian function 

J t j 
C[s,z,v,u] = J2T / A ^ T K- (r) - hJ(r) Sj f + J^A* £ sj* s; 

j=l r=0 j=l 

+ E E [H' ) T (^ - <) + ("jVis,- - <)] (4) 

as well as the quadratically augmented Lagrangian 

J 



'j z j III + II s ?' ^ III 



(5) 



£ c [s,z,w,u] =£[s,z,v,u] + |E E [H S J 

where c is a positive penalty coefficient; and s := {sj}j =l , z := {z?- , zP- Yj£j 3 » and [«, tt] := {vj , trj H-^- 3 - 
Observe that the remaining constraints in (3), namely z £ C 2 := {z : zP- = , j G , j' G A/}, j 7^ j'}, 
have not been dualized. 

Towards deriving the D-RLS recursions, the alternating minimization algorithm (AMA) of [31] will be 
adopted here to tackle the separable EWLSE reformulation (2) in a distributed fashion. Much related to 
AMA is the alternating-direction method of multipliers (AD-MoM), an iterative augmented Lagrangian 
method specially well-suited for parallel processing [2], [15], [27]. While the AD-MoM has been proven 
successful to tackle the optimization tasks stemming from general distributed estimators of deterministic and 
(non-)stationary random signals, it is somehow curious that the AMA has remained largely underutilized. 

To minimize (2) at time instant t, the AMA solver entails an iterative procedure comprising three steps 
per iteration k = 0, 1, 2, . . . 

[SI] Multiplier updates: 

vf (t; k) = vj'(t; k - 1) + c[ Sj (t; k) - zj'(t; k)], j G J, f G Mj 
irj' (t; fc) = uf (t; fc - 1) + cfo/fo fc) - zf (t; k)], j G / G A/}. 
[S2] Local estimate updates: 

s(t, k + 1) = arg min£ [a, z(t, k), v(t, k), u(t, k)\ . (6) 



September 23, 2011 



DRAFT 



IEEE TRANSACTIONS ON SIGNAL PROCESSING (SUBMITTED) 



6 



[S3] Auxiliary variable updates: 

z(t, k + 1) = arg min C c [s(t, k + 1), z, v(t, k), u(t, k)] . (7) 

Steps [SI] and [S3] are identical to those in AD-MoM [2]. In words, these steps correspond to dual ascent 
iterations to update the Lagrange multipliers, and a block coordinate-descent minimization of the augmented 
Lagrangian with respect to z G C z , respectively. The only difference is with regards to the local estimate 
updates in [S2], where in AMA the new iterates are obtained by minimizing the ordinary Lagrangian with 
respect to s. For the sake of the aforementioned minimization, all other variables are considered fixed 
taking their most up to date values {z(t,k),v(t,k),u(t,k)}. For the AD-MoM instead, the minimized 
quantity is the augmented Lagrangian both in [S2] and in [S3]. 

The AMA was motivated in [31] for separable problems that are strictly convex in s, but (possibly) only 
convex with respect to z . Under this assumption, [S2] still yields a unique minimizer per iteration, and the 
AMA is useful for those cases in which the Lagrangian is much simpler to optimize than the augmented 
Lagrangian. Because of the regularization matrix <&o >~ pxp , the EWLS cost in (2) is indeed strictly 
convex for all t > 0, and the AMA is applicable. Section II-B dwells into the benefits of minimizing the 
ordinary Lagrangian instead of its augmented counterpart (5), in the context of distributed RLS estimation. 

Carrying out the minimization in [S3] first, one finds 

zf (t, k + 1) = zj'(t, k + 1) = - [ Sj (t, k + l) + s r (t, k + 1)}, j G J, f G Afj 
so that vj (t; k) = — irj (t; k) for all k > — 1 [15]. As a result vj (t; k) is given by 

yr>'(t-k) = vj\t-,k-l) + ^[ Sj (t-,k)-s r (t;k)}, j £ J, f G A/} . (8) 
Moving on to [S2], from the separable structure of (4) the minimization (6) can be split into J subproblems 

Sj(t, k + 1) = arg min 



^A^[x J (r)-hJ(r)s J ] 2 + J- 1 A*sJ* s i + £ [vf(t, k) - vj,(i, k) 

r=0 j'eAj 



Since each of the local subproblems corresponds to an unconstrained quadratic minimization, they all admit 
closed-form solutions 



Sj (t,k+i) = *j i w j (t)-l*- i (t) [vfM)-4(i,*o 



(9) 



where 



: = A *" Th i( r ) h J( r ) + ^ _1 A'*o = A*j-(t - 1) + hj(t)hj(t) (10) 

T=0 

t 

^(t) := A*- r h i (r)^(r) = X^(t - 1) + h ; (/). !/i. (11) 



r=0 
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Recursions (8) and (9) constitute the AMA-based D-RLS algorithm, whereby all sensors j G J keep track 
of their local estimate Sj(t; k + 1) and their multipliers {vj (t; k)}^^^, which can be arbitrarily initialized. 
From the rank-one update in (10) and capitalizing on the matrix inversion lemma, matrix can be 

efficiently updated according to 

with complexity 0{p 2 ). It is recommended to initialize the matrix recursion with SJ^O) = J*^ 1 := 5l p , 
where 5 > is chosen sufficiently large [23]. Not surprisingly, by direct application of the convergence 
results in [31, Proposition 3], it follows that: 

Proposition 1: For arbitrarily initialized {vj (t; — tyYj^j 1 > (t ; 0) and c S (0,c„); the local estimates 
Sj(t; k) generated by (9) reach consensus as k — > oo; i.e., 

lim Sj(t; k) = s ew i s (t), for all j € J . 

k— >oo 

The upper bound c u is proportional to the modulus of the strictly convex cost function in (2), and 
inversely proportional to the norm of a matrix suitably chosen to express the linear constraints in (3); 
further details are in [31, Section 4]. Proposition 1 asserts that per time instant t, the AMA-based D-RLS 
algorithm yields a sequence of local estimates that converge to the global EWLSE sought, as k — > oo, or, 
pragmatically for large enough k. In principle, one could argue that running many consensus iterations may 
not be a problem in a stationary environment. However, when the WSN is deployed to track a time-varying 
parameter vector So(i), one cannot afford significant delays in-between consecutive sensing instants. 

One possible way to overcome this hurdle is to run a single consensus iteration per acquired observation 
Xj(t). Specifically, letting k = t in recursions (8)-(9), one arrives at a single time scale D-RLS algorithm 
which is suitable for operation in nonstationary WSN environments. Accounting also for additive communi- 
cation noise that corrupts the exchanges of multipliers and local estimates, the per sensor tasks comprising 
the novel AMA-based single time scale D-RLS algorithm are given by 



vf (t) = vj'(i -!) + £ [s 3 (t) - (s f (t) + rr>'(t))] , f G Afj (13) 

<&~ (t + 1) = X $>~ (t) J V J (14) 

if> At + 1) = \i/>At) + hj(t + l)^-^ + 1) (15) 



s,-(t + l) = *7 1 (i + l)^.( t + l)-I*-i(t + l) J2 [vf(t)-(vj,(t) + <(t)) 



(16) 
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Algorithm 1 : AMA-based D-RLS 

Arbitrarily initialize {sj(0)}/ =1 and {vf(-l)}f e e ^. 
for t = 0,1,... do 

All j £ J: transmit Sj(t) to neighbors in Afj. 

All j £ J: update {\f (t)}jv eA/ -. using (13). 

All j £ J: transmit vj (t) to each j' £ Mj. 

All j £ J: update <&j(t + 1) and ipj(t + 1) using (14) and (15), respectively. 
All j £ J: update Sj(t + 1) using (16). 
end for 



Recursions (13)-(15) are tabulated as Algorithm 1, which also details the inter-sensor cornmunications 
of multipliers and local estimates taking place within neighborhoods. When powerful error control codes 
render inter-sensor links virtually ideal, direct application of the results in [15], [16] show that D-RLS can 
be further simplified to reduce the communication overhead and memory storage requirements. 

B. Comparison with the AD-MoM-based D-RLS algorithm 

A related D-RLS algorithm was put forth in [15], whereby the decomposable exponentially-weighted LS 
cost (2) is minimized using the AD-MoM, rather than the AMA as in Section II-A. Recall that the AD-MoM 
solver yields Sj(t+ 1) as the optimizer of the augmented Lagragian, while its AMA counterpart minimizes 
the ordinary Lagrangian instead. Consequently, different from (16) local estimates in the AD-MoM-based 
D-RLS algorithm of [15] are updated via 



f (t + l) = *7 1 (t + l)^.(t + l) + |*J 1 (t + l) £ [ Sj (t) + (s r (t)+ri]'(t)) 



(17) 



where [cf. (10)] 



*j(t) -^A'-^rlhJV) + rt'*o + c|A/}|V (18) 

r=0 

Unless A = 1, it is impossible to derive a rank-one update for *tj(t) as in (10). The reason is the 
regularization term c|A/j|I p in (18), a direct consequence of the quadratic penalty in the augmented 
Lagrangian (5). This prevents one from efficiently updating l (t + 1) in (17) using the matrix inversion 
lemma [cf. (14)]. Direct inversion of $j(t + 1) per iteration dominates the computational complexity of 
the AD-MoM-based D-RLS algorithm, which is roughly C(p 3 ) [15]. 
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Unfortunately, the penalty coefficient cannot be set to zero because the D-RLS algorithm breaks down. 
For instance, when the initial Lagrange multipliers are null and c = 0, D-RLS boils down to a purely 
local (L-) RLS algorithm where sensors do not cooperate, hence consensus cannot be attained. All in 
all, the novel AMA-based D-RLS algorithm of this paper offers an improved alternative with an order 
of magnitude reduction in terms of computational complexity per sensor. With regards to communication 
cost, the AD-MoM-based D-RLS and Algorithm 1 here incur identical overheads; see [15, Sec. III-B] for 
a detailed analysis of the associated cost, as well as comparisons with the I-RLS [24] and diffusion RLS 
algorithms [3]. 

While the AMA-based D-RLS algorithm is less complex computationally than its AD-MoM counterpart 
in [15], Proposition 1 asserts that when many consensus iterations can be afforded, convergence to the 
centralized EWLSE is guaranteed provided c € (0, c u ). On the other hand, the AD-MoM-based D-RLS 
algorithm will attain the EWLSE for any c > (cf. [15, Prop. 1]). In addition, it does not require tuning 
the extra parameter 5, since it is applicable when <&o = pxp because the augmented Lagrangian provides 
the needed regularization. 

III. Analysis Preliminaries 
A. Scope of the analysis: assumptions and approximations 

Performance evaluation of the D-RLS algorithm is much more involved than that of e.g., D-LMS [16], 
[26]. The challenges are well documented for the classical (centralized) LMS and RLS filters [23], [29], 
and results for the latter are less common and typically involve simplifying approximations. What is more, 
the distributed setting introduces unique challenges in the analysis. These include space-time sensor data 
and multiple sources of additive noise, a consequence of imperfect sensors and communication links. 

In order to proceed, a few typical modeling assumptions are introduced to delineate the scope of the 
ensuing stability and performance results. For all j £ J, it is assumed that: 

(al) Sensor observations adhere to the linear model Xj(t) = hJ(t)so + €j(t), where the zero-mean 
white noise {&j(t)} has variance a\ ; 

(a2) Vectors {hj (t)} are spatio-temporally white with covariance matrix R^. y pX p/ an d 

(a3) Vectors {hj(t)}, {ej(t)}, {77? (t)}f^j-. and {fij (t)}^^ are independent. 
Assumptions (al)-(a3) comprise the widely adopted independence setting, for sensor observations that are 
linearly related to the time-invariant parameter of interest; see e.g., [29, pg. 110], [23, pg. 448]. Clearly, 
(a2) can be violated in, e.g., FIR filtering of signals (regressors) with a shift structure as in the distributed 
power spectrum estimation problem described in [26] and [15]. Nevertheless, the steady-state performance 
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results extend accurately to the pragmatic setup that involves time-correlated sensor data; see also the 
numerical tests in Section V. In line with a distributed setting such as a WSN, the statistical profiles of 
both regressors and the noise quantities vary across sensors (space), yet they are assumed to remain time 
invariant. For a related analysis of a distributed LMS algorithm operating in a nonstationary environment, 
the reader is referred to [16]. 

In the particular case of the D-RLS algorithm, a unique challenge stems from the stochastic matrices 
present in the local estimate updates (16). Recalling (10), it is apparent that depends 
upon the whole history of local regression vectors {hj(r)}^. =0 . Even obtaining $J 1 (t)'s distribution or 
computing its expected value is a formidable task in general, due to the matrix inversion operation. It is for 
these reasons that some simplifying approximations will be adopted in the sequel, to carry out the analysis 
that otherwise becomes intractable. 

Neglecting the regularization term in (10) that vanishes exponentially as t — > oo, the matrix &j(t) is 
obtained as an exponentially weighted moving average (EWMA). The EWMA can be seen as an average 
modulated by a sliding window of equivalent length 1/(1 — A), which clearly grows as A — > 1. This 
observation in conjunction with (a2) and the strong law of large numbers, justifies the approximation 

*j(t) « J5[*j(t)] = — ^r-, < A < 1 and t -> oo. (19) 
1 — A 

The expectation of &J l (t), on the other hand, is considerably harder to evaluate. To overcome this 
challenge, the following approximation will be invoked [3], [23] 

El&J 1 ^)] w E-^&jit)] « (1 - A)R^, 0< A < 1 and t -> oo. (20) 

It is admittedly a crude approximation at first sight, because E [X^ 1 ^ ^ -EpT]" 1 in general, for any random 
variable X. However, experimental evidence suggests that the approximation is sufficiently accurate for all 
practical purposes, when the forgetting factor approaches unity [23, p. 319]. 

B. Error-form D-RLS 

The approach here to steady-state performance analysis relies on an 'averaged' error-form system 
representation of D-RLS in (13)-(16), where $>~ 1 (t) in (16) is replaced by the approximation (1 — A)R^ . , 
for sufficiently large t. Somehow related approaches were adopted in [3] and [1]. Other noteworthy analysis 
techniques include the energy-conservation methodology in [35], [23, p. 287], and stochastic averaging [29, 
p. 229]. For performance analysis of distributed adaptive algorithms seeking time-invariant parameters, the 
former has been applied in e.g., [13], [14], while the latter can be found in [26]. 
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Towards obtaining such error-form representation, introduce the local estimation errors {yij (t) := Sj (t) — 

s o}/=i an d multiplier-based quantities {y2j(£) := \ Ylj'eU- ( v j (* ~~ 1) ~~ v i'(* ~~ l))}/=r ^ turns out that 
a convenient global state to describe the spatio-temporal dynamics of D-RLS in (13)-(16) is y(t) := 

[yf(*) yfWF = [yl,i(t) ■ ■ - ylj(t) yli(t) ■ ■ - ylA^f € ^ 2Jp - ^ addition, to concisely capture the 
effects of both observation and communication noise on the estimation errors across the WSN, define the 

Jpx 1 noise supervectors e(t) := Y1t=o ^ t T [hf ( r ) e i( T ) • • • hj( T ) e j( T )] T an d := [^(i) • • ■ ?7j(£)] T - 
Vectors {fjj(t)}j =1 represent the aggregate noise corrupting the multipliers received by sensor j at time 
instant t, and are given by 

Their respective covariance matrices are easily computable under (a2)-(a3). For instance, 

rp 1 - A 2 (* +1 ) , O , 



1 _ a 2 (* +1 ) 

Re(t) := £[e(£)e T (i)] = ^—^ bdiag(R ftl <, . . . ,R hj o^ 



(22) 



while the structure of R: 



E[rj(t)r] T (t)] is given in Appendix E. Two additional Jp x 1 com- 



munication noise supervectors are needed, namely r] a (t) := [(r]f(t)) J 

T 



l T 



(v a At)y\ and V/3 (t) :- 



(nf(*)r- (»#(*)) 



(9/ 



, where for j ^ J 

c 



(23) 



Finally, let (c/2)L<g>I p G M Jpx,7p be a matrix capturing the WSN connectivity pattern through the (scaled) 
graph Laplacian matrix L, and define R^" 1 := bdiag(R^ i , . . . , R^" ). Based on these definitions, it is 
possible to state the following important lemma established in Appendix A. 

Lemma 1: Let (al) and (a2) hold. Then for t > to with to sufficiently large while C A < 1, the global 
state y(t) approximately evolves according to 



y(t + l)=bdiag((l-\)R-\l Jp ) < 



< Ty(t) + 




e(t + l) + 






QjpxJp 




Ojpx Jp 



+ 


Ijp 



















(24) 



where the 2Jp x 2Jp matrix T consists of the Jp x Jp blocks [Y]n = — [T]2i = — L c [Y]i2 = 
~~ [Y]22 = — Ijp- initial condition y(to) should be selected as y(to) = bdiag(Ij p , L c )y'(to), where 
y'(to) is any vector in M. 2Jp . 

The convenience of representing y(i) as in Lemma 1 will become apparent in the sequel, especially 
when investigating sufficient conditions under which the D-RLS algorithm is stable in the mean sense 
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(Section IV-A). In addition, the covariance matrix of the state vector y(t) can be shown to encompass all 
the information needed to evaluate the relevant per sensor and networkwide performance figures of merit, 
the subject dealt with next. 

C. Performance Metrics 

When it comes to performance evaluation of adaptive algorithms, it is customary to consider as figures of 
merit the so-called MSE, excess mean-square error (EMSE), and mean-square deviation (MSD) [23], [29]. 
In the present setup for distributed adaptive estimation, it is pertinent to address both global (network- 
wide) and local (per-sensor) performance [14]. After recalling the definitions of the local a priori error 
ej(t) := Xj(t) — hj(t)sj(t — 1) and local estimation error yij{t) := &j(t) — So, the per-sensor performance 
metrics are defined as 

MSE,-(i) :=E[e](t)] 
EMSEj (t) :=E[(hj(t) yi j(t-l)f} 
MSDj(i) := E[\\ yij (t)\\ 2 ] 

whereas their global counterparts are defined as the respective averages across sensors, e.g., MSE(i) := 
J" 1 £/=i^[ e i(*) 2 ]. and so on. 

Next, it is shown that it suffices to evaluate the state covariance matrix H y (t) := E[y(t)y T (t)] in order 
to assess the aforementioned performance metrics. To this end, note that by virtue of (al) it is possible to 
write ej(t) = —hj(t)yij(t — l) + ej(t). Because yij(t— 1) is independent of the zero-mean {hj(t), ej(t)} 
under (al)-(a3), from the previous relationship between the a priori and estimation errors one finds that 
MSEj(t) = EMSEj (t) + a\.. Hence, it suffices to focus on the evaluation of EMSEj (t), through which 
MSEj (t) can also be determined under the assumption that the observation noise variances are known, or 
can be estimated for that matter. If H yi .(t) := E[yij(t)yJ j(t)] denotes the j-th local error covariance 
matrix, then MSDj(t) = tr(R 2/l (£)); and under (al)-(a3), a simple manipulation yields 

EMSEj (t) = E[ti((hj(t) yij (t - l)) 2 )] = tr(E[h,(t)hj(t) yi ,(t - l)y^(t - 1)]) 
= tr(E[hj(t)hJ(t)]S[ yi) j(t - l)y^.(t - 1)]) = tr(R hj R yi j (t - 1)). 

To derive corresponding formulas for the global performance figures of merit, let H yi (t) := E[y\(t)yJ (t)] 
denote the global error covariance matrix, and define Hh '■= E[Hh(t)] = bdiag(R/ ll , . . . ,Rfr ; ). It follows 
that MSD(t) = J~ 1 tr(R 2/1 (t)), and EMSE(t) = J- 1 tr(R h R 1 , 1 (t - 1)). 



September 23, 2011 



DRAFT 



IEEE TRANSACTIONS ON SIGNAL PROCESSING (SUBMITTED) 13 

TABLE I 

Evaluation of local and global figures of merit from R H (t) 





MSD 


EMSE 


MSE 


Local 


*([R»(t)]iij) 


tr(R hj [Ryit-lfinj) 


tr(R % [R y (t-l)]u, i )+^. 


Global 


./-^([R^i)]!!) 


■/"^(R^R^t- l)]n) 


J- 1 tr(R ft [R y (t - l)]n) + J- 1 £/=i 4 



It is now straightforward to recognize that Ry (i) indeed provides all the information needed to evaluate 
the performance of the D-RLS algorithm. For instance, observe that the global error covariance matrix 
R yi (t) corresponds to the Jp x Jp upper left submatrix of Hy(t), which is denoted by [Rj,(t)]n. Further, 
the j'-fh p x p diagonal submatrix (J = 1, . . . , J) of [R, y (t)]n is exactly H yi . (t), and is likewise denoted 
by [R y (t)]n j. For clarity, the aforementioned notational conventions regarding submatrices within R f/ (t) 
are illustrated in Fig. 1. In a nutshell, deriving a closed-form expression for Ry(£) enables the evaluation 
of all performance metrics of interest, as summarized in Table I. This task will be considered in Section 
IV-B. 

Remark 1 Since the 'average' system representation of y(t) in (24) relies on an approximation that 
becomes increasingly accurate as A — > 1 and t — > oo, so does the covariance recursion for R y (i) derived 
in Section IV-B. For this reason, the scope of the MSE performance analysis of this paper pertains to the 
steady-state behavior of the D-RLS algorithm. 

IV. Stability and Steady-State Performance Analysis 

In this section, stability and steady-state performance analyses are conducted for the D-RLS algorithm 
developed in Section II-A. Because recursions (13)-(16) are stochastic in nature, stability will be assessed 
both in the mean- and in the MSE-sense. The techniques presented here can be utilized with minimal 
modifications to derive analogous results for the AD-MoM-based D-RLS algorithm in [15]. 

A. Mean Stability 

Based on Lemma 1, it follows that D-RLS achieves consensus in the mean sense on the parameter sq. 
Proposition 2: Under (al)-(a3) and for <C A < 1, D-RLS achieves consensus in the mean, i.e., 

lim E[y lsj {t)] = p , Vjej 
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provided the penalty coefficient is chosen such that 



4 



< c < 



(25) 



(1- A)A max (R^(L®I p )) 



Proof: Based on (al)-(a3) and since the data is zero-mean, one obtains after taking expectations on 
(24) that E[y(t)] = bdiag((l — A)R / ^ 1 , Ij p )¥E[y(t — 1)]. The following lemma characterizes the spectrum 
of the transition matrix fi := bdiag((l — A)R7 ,I/ P )Y; see Appendix B for a proof. 

Lemma 2: Regardless of the value of c > 0, matrix fi := bdiag((l — A)R / ~ 1 ,Ij p )T G R 2J p x2J p has 
p eigenvalues equal to one. Further, the left eigenvectors associated with the unity eigenvalue have the 
structure \J = [Oi x jp q^], where E nullspace(L c ) and i = 1, . . . ,p. The remaining eigenvalues are 
equal to zero, or else have modulus strictly smaller than one provided c satisfies the bound (25). 

Back to establishing the mean stability result, let {u,} and {vf} respectively denote the collection of p 
right and left eigenvectors of fi associated with the eigenvalue one. By virtue of Lemma 2 and provided 
c satisfies the bound (25), one has that lim^oo f2* = Yli=i u « v f > hence, 



In obtaining the second equality, the structure for y(io) that is given in Lemma 1 was used. The last 
equality follows from the fact that q; £ nullspace(L c ) as per Lemma 2, thus completing the proof. ■ 
Before wrapping up this section, a comment is due on the sufficient condition (25). When performing 
distributed estimation under <C A < 1, the condition is actually not restrictive at all since a 1 — A 
factor is present in the denominator. When A is close to one, any practical choice of c > will result in 
asymptotically unbiased sensor estimates. Also note that the bound depends on the WSN topology, through 
the scaled graph Laplacian matrix L c . 

B. MSE Stability and Steady-State Performance 

In order to assess the steady-state MSE performance of the D-RLS algorithm, we will evaluate the figures 
of merit introduced in Section III-C. The limiting values of both the local (per sensor) and global (network- 
wide) MSE, excess mean-square error (EMSE), and mean-square deviation (MSD), will be assessed. To 
this end, it suffices to derive a closed-form expression for the global estimation error covariance matrix 
'R yi (t) := E[yi(t)yJ (t)], as already argued in Section III-C. 
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The next result provides an equivalent representation of the approximate D-RLS global recursion (24), 
that is more suitable for the recursive evaluation of H yi (t). First, introduce the p(X)/=i Wj\) x 1 vector 



r,(t) := [{(^Wf^eM •••{(*?/< WfW] 



(26) 



which comprises the receiver noise terms corrupting transmissions of local estimates across the whole 
network at time instant t, and define R77 := E[rf(t)rj T (t)]. For notational convenience, let ~R^\ := 
(l-A)R- 1 . 

Lemma 3: Under the assumptions of Lemma 1, the global state y(t) in (24) can be equivalently written 
as 

Ojpx Jp 



y(t + 1) = bdiag(I Jp , L c )z(t + 1) + 



(27) 



The inner state z(t) := [zT(t) z^(i)] r is arbitrarily initialized at time to, and updated according to 



z(t + l) = *z(t) + * 



Rm(p«-p^) 
c 



»7(t-l) + * 



R 



-1 



Ojpx Jp 



r; 



-1 



QjpxJp 



e(t+l) (28) 



where the 2Jp x 2Jp transition matrix \I/ consists of the blocks [^]ii = [^]i2 = — R^ \Lc [^21 = 
[^22 = L c Lj. Matrix C is chosen such that L C C = P/3 — P a , where the structure of the time-invariant 
matrices P a and Pa itf g/ven in Appendix E. 

Proof: See Appendix C. ■ 
The desired state y(t) is obtained as a rank-deficient linear transformation of the inner state z(i), plus a 
stochastic offset due to the presence of communication noise. A linear, time-invariant, first-order difference 
equation describes the dynamics of z(t), and hence of y(t), via the algebraic transformation in (27). 
The time-invariant nature of the transition matrix \l/ is due to the approximations QJ 1 ^) R^ A , j £ J, 
particularly accurate for large enough t > to- Examination of (28) reveals that the evolution of z{t) is driven 
by three stochastic input processes: i) communication noise rj(t — 1) affecting the transmission of local 
estimates; ii) communication noise fj(t — 1) contaminating the Lagrange multipliers; and hi) observation 
noise within e(f + 1). 

Focusing now on the calculation of R yi (t) = [R y (i)]n based on Lemma 3, observe from the upper 
Jp x 1 block of y(t + 1) in (27) that y x (t + 1) = z x (i + 1) + R^ x [fj(t) + (P Q - P p )rj(t)]. Under (a3), 
z±(t + 1) is independent of the zero-mean {i)(t),r](t)}; hence, 



R yi (t) - R 2l (i) + R^ [Rfy + (P Q - P / 3)R^(P Q - P/?) T ] R^a 



(29) 
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which prompts one to obtain R z (t) := E[z(t)z T (t)]. Specifically, the goal is to extract its upper-left JpxJp 
matrix block [R z (t)]n = R 2l (i). To this end, define the vectors 



Vx(t) :-- 



r; 



-i 



Ojpx Jp 



V(t), Vx(t) ■-- 



c 



(30) 



whose respective covariance matrices R^ := E[fj x (t)f]'^(t)] and R^ := E[rj x (t)r]^(t)) have a structure 
detailed in Appendix E. Also recall that e(t) depends on the entire history of regressors up to time instant t. 
Starting from (28) and capitalizing on (a2)-(a3), it is straightforward to obtain a first-order matrix recursion 
to update R z (t) as 



R z (t) = *R^(i - 1)* T + *R^ * T + *Rt7 * 5 





Rc(t) 


R M 


Ojpx Jp 




Ojpx Jp 



+ *R,e(i) 







R / t ,A 

Jpy.Jp 



( 



+ 



V 



R h,A 
QjpxJp 



:= *R,(t- 1)V T + -R u (t) 



(31) 



(32) 



where the cross-correlation matrix R ze (t) := £?[z(t — l)e T (t)] is recursively updated as (cf. Appendix D) 



R, e (t) = A*R 2e (i-l) + A 



Rc(t-l). 



(33) 



R M 

Ojpx Jp 

For notational brevity in what follows, R*/(i) in (32) denotes all the covariance forcing terms in the 
right-hand side of (31). The main result of this section pertains to MSE stability of the D-RLS algorithm, 
and provides a checkable sufficient condition under which the global error covariance matrix H yi (t) has 
bounded entries as t — > oo. Recall that a matrix is termed stable, when all its eigenvalues lie strictly inside 
the unit circle. 

Proposition 3: Under (al )-(a3) and for <C A < 1, D-RLS is MSE stable, i.e., rimt_ s>00 R yi (t) has bounded 
entries, provided that c > is chosen so that \I/ is a stable matrix. 
Proof: First observe that because A € (0, 1), it holds that 

-i \2(t+l)\ 



lim Re(t) = lim 



t— >oo 



t— >oo 



1 - A 2 



bdiag(R fc X 1 ,...,R h X J ) 



(34) 
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If c > is selected such that * is a stable matrix, then clearly A* is also stable, and hence the matrix 
recursion (33) converges to the bounded limit 

R ze (oo) = (Iajp-A*) - 

Based on the previous arguments, it follows that the forcing matrix Rj/(i) in (31) will also attain a bounded 
limit as t — > oo, denoted as Rp>(oo). Next, we show that lim^oo R 2 (t) has bounded entries by studying 
its equivalent vectorized dynamical system. Upon vectorizing (32), it follows that 

\ec\R z (t)] =vec[*R z (t - 1)* T ] + vec\R v {t)] 

= (* ® *) vec[R 2 (t - 1)] + vec[R u (t)} 

where in obtaining the last equality we used the property vec [RST] = (T T ®R) vec [SI. Because the 
eigenvalues of \l/ (g> \l/ are the pairwise products of those of stability of \l/ implies stability of the 
Kronecker product. As a result, the vectorized recursion will converge to the limit 

vec[R z (oo)] = (l( 2Jp) 2 - * ® vec[Ri/(oo)] (36) 

which of course implies that lim^oo R 2 (t) = R 2 (oo) has bounded entries. From (29), the same holds true 
for H yi (t), and the proof is completed. ■ 
Proposition 3 asserts that the AMA-based D-RLS algorithm is stable in the MSE-sense, even when the 
WSN links are challenged by additive noise. While most distributed adaptive estimation works have only 
looked at ideal inter-sensor links, others have adopted diminishing step-sizes to mitigate the undesirable 
effects of communication noise [10], [11]. This approach however, limits their applicability to stationary 
environments. Remarkably, the AMA-based D-RLS algorithm exhibits robustness to noise when using a 
constant step-size c, a feature that has also been observed for AD-MoM related distributed iterations in 
e.g., [26], [27], and [15]. 

As a byproduct, the proof of Proposition 3 also provides part of the recipe towards evaluating the 
steady-state MSE performance of the D-RLS algorithm. Indeed, by plugging (34) and (35) into (31) one 
obtains the steady-state covariance matrix R I/ .(oo). It is then possible to evaluate R 2 (oo), by reshaping 
the vectorized identity (36). Matrix R 2l (oo) can be extracted from the upper-left Jp x Jp matrix block of 
R 2 (oo), and the desired global error covariance matrix R yi (oo) = [Ry(oo)]n becomes available via (29). 
Closed-form evaluation of the MSE(oo), EMSE(oo) and MSD(oo) for every sensor j e S is now possible 
given Ry^oo), by resorting to the formulae in Table I. 

Before closing this section, an alternative notion of stochastic stability that readily follows from Proposi- 
tion 3 is established here. Specifically, it is possible to show that under the independence setting assumptions 



Ojpx Jp 



Re(oo) 



(35) 
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(al)-(a3) considered so far, the global error norm ||yi(i)|| remains most of the time within a finite interval, 
i.e., errors are weakly stochastic bounded (WSB) [28], [29, pg. 110]. This WSB stability guarantees that 
for any 9 > 0, there exists a ( > such that Pr[||yi(i)|| < (] = 1 — uniformly in time. 

Corollary 1: Under (al)-(a3) and for <C A < 1, if c > is chosen so that \l/ is a stable matrix, then the 
D-RLS algorithm yields estimation errors which are WSB; i.e., lim^oo sup t>to Pr[||yi(t)|| > Q = 0. 
Proof: Chebyshev's inequality implies that 

P r[ || yi(i) ||> (] <M^ = Mil). (37) 

From Proposition 3, lim t _ >00 [R y (i)]n has bounded entries, implying thatsup t>to tr([Ry(t)]n) < oo. Taking 
the limit as ( — > oo, while relying on the bound in (37) which holds for all values of t > to, yields the 
desired result. ■ 
In words, Corollary 1 ensures that with overwhelming probability, local sensor estimates remain inside a 
ball with finite radius, centered at so- It is certainly a weak notion of stability, many times the only one that 
can be asserted when the presence of, e.g., time-correlated data, renders variance calculations impossible; 
see also [26], [28]. In this case where stronger assumptions are invoked, WSB follows immediately once 
MSE-sense stability is established. Nevertheless, it is an important practical notion as it ensures - on a 
per-realization basis - that estimation errors have no probability mass escaping to infinity. In particular, 
D-RLS estimation errors are shown WSB in the presence of communication noise; a property not enjoyed 
by other distributed iterations for e.g., consenting on averages [34]. 

V. Numerical Tests 

Computer simulations are carried out here to corroborate the analytical results of Section IV-B. Even 
though based on simplifying assumptions and approximations, the usefulness of the analysis is justified 
since the predicted steady-state MSE figures of merit accurately match the empirical D-RLS limiting 
values. In accordance with the adaptive filtering folklore, when A — > 1 the upshot of the analysis under the 
independence setting assumptions is shown to extend accurately to the pragmatic scenario whereby sensors 
acquire time-correlated data. For J = 15 sensors, a connected ad hoc WSN is generated as a realization of 
the random geometric graph model on the unit-square, with communication range r = 0.3 [9]. To model 
non-ideal inter-sensor links, additive white Gaussian noise (AWGN) with variance crjj = 10" 1 is added at 
the receiving end. The WSN used for the experiments is depicted in Fig. 2. 

With p = 4 and sq = l p , observations obey a linear model [cf. (al)] with sensing WGN of spatial 
variance profile of. = 10 _3 Oj, where aij ~ U[0, 1] (uniform distribution) and i.i.d.. The regression vectors 
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hj(t) := [hj(t) . . . hj(t — p + 1)] T have a shift structure, and entries which evolve according to first- 
order stable autoregressive processes hj(t) = (1 — p)(3jhj(t — 1) + ^fpujj(t) for all j € J . We choose 
p = 5 x 10 _1 , the /3j ~ U[0, 1] i.i.d. in space, and the driving white noise ujj(t) ~ U[— v^cr^. , \/3o"wJ 
with spatial variance profile given by a^. = with 7-,- ~ ZY[0, 1] and i.i.d.. Observe that the data is 
temporally-correlated, implying that (a2) does not hold here. 

For all experimental performance curves obtained by running the algorithms, the ensemble averages are 
approximated by sample averaging 200 runs of the experiment. 

First, with A = 0.95, c = 0.1 and 5 = 100 for the AMA-based D-RLS algorithm, Fig. 3 depicts the 
network performance through the evolution of the EMSE(t) and MSD(t) figures of merit. Both noisy 
and ideal links are considered. The steady-state limiting values found in Section IV-B are extremely 
accurate, even though the simulated data does not adhere to (a2), and the results are based on simplifying 
approximations. As intuitively expected and analytically corroborated via the noise-related additive terms 
in (29) and (31), the performance penalty due to non-ideal links is also apparent. 

We also utilize the analytical results developed throughout this paper to contrast the per sensor perfor- 
mance of D-RLS and the D-LMS algorithm in [16]. In particular, the parameters chosen for D-LMS are 
/i = 5x 10~ 3 and c = 1. Fig. 4 shows the values of the EMSEj(oo) and MSD,(oo) for all j <E J. As 
expected, the second-order D-RLS scheme attains improved steady-state performance uniformly across all 
sensors in the simulated WSN. In this particular simulated test, gains as high as 5dB in estimation error can 
be achieved at the price of increasing computational burden per sensor, from 0{p) to 0{p 2 ) per iteration. 

VI. Concluding Summary and Future Work 

A distributed RLS-like algorithm is developed in this paper, which is capable of performing adaptive 
estimation and tracking using WSNs in which sensors cooperate with single-hop neighbors. The WSNs 
considered here are quite general since they do not necessarily possess a Hamiltonian cycle, while the inter- 
sensor links are challenged by communication noise. Distributed iterations are derived after: i) reformulating 
in a separable way the exponentially weighed least-squares (EWLS) cost involved in the classical RLS 
algorithm; and ii) applying the AMA to minimize this separable cost in a distributed fashion. The AMA 
is especially well-suited to capitalize on the strict convexity of the EWLS cost, and thus offer significant 
reductions in computational complexity per sensor, when compared to existing alternatives. This way, salient 
features of the classical RLS algorithm are shown to carry over to a distributed WSN setting, namely 
reduced-complexity estimation when a state and/or data model is not available and fast convergence rates 
are at a premium. 
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An additional contribution of this paper pertains to a detailed steady-state MSE performance analysis, 
that relies on an 'averaged' error-form system representation of D-RLS. The theory is developed under 
some simplifying approximations, and resorting to the independence setting assumptions. This way, it is 
possible to obtain accurate closed-form expressions for both the per sensor and network-wide relevant 
performance metrics as t — > oo. Sufficient conditions under which the D-RLS algorithm is stable in the 
mean- and MSE-sense are provided as well. As a corollary, the D-RLS estimation errors are also shown 
to remain within a finite interval with high probability, even when the inter-sensor links are challenged 
by additive noise. Numerical simulations demonstrated that the analytical findings of this paper extend 
accurately to a more realistic WSN setting, whereby sensors acquire temporally correlated sensor data. 

Regarding the performance of the D-RLS algorithm, there are still several interesting directions to pursue 
as future work. First, it would be nice to establish a stochastic trajectory locking result which formally 
shows that as A — > 1, the D-RLS estimation error trajectories closely follow the ones of its time-invariant 
'averaged' system companion. Second, the steady-state MSE performance analysis was carried out when 
< A < 1. For the infinite memory case in which A = 1, numerical simulations indicate that D-RLS 
provides mean-square sense-consistent estimates, even in the presence of communication noise. By formally 
establishing this property, D-RLS becomes an even more appealing alternative for distributed parameter 
estimation in stationary environments. While the approximations used in this paper are no longer valid 
when A = 1, for Gaussian i.i.d. regressors matrix <1> -I (t) is Wishart distributed with known moments. 
Under these assumptions, consistency analysis is a subject of ongoing investigation. 

Appendix 

A. Proof of Lemma 1 : Let to be chosen large enough to ensure that 

t 



lim ^(t) = lim £ A*- T hi(r)hJ(r) + J^X^o « ^r, j € J. 

;-Ho t— n *■ — » J I — A 

T=0 



For t > to, consider replacing (i) in (16) with the approximation (1 — A)R ft for its expected value, 
to arrive at the 'average' D-RLS system recursions 

vf (t) = vf (t - 1) + \ [ Sj (t) - (s f (t) + rf: (t))] , f e Mj (38) 



Sj (t + !) = (!- A)B^-(< + 1) - hi - A)R^! H'W " (4^ + 



(39) 
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After summing (vj (t) — vj,(i))/2 over f £ Mj, it follows from (38) that for all j £ J 

y 2>j (t + 1) := \ J2 H' (*) - vj(t)) = y 2J (t) + | ^ ( B iW " s ^)) 



| (40) 



= Y2j(t) + \ E (yij(*) - yij'W) - (*) + ^ ^ 41 ) 

j'eA/J 

where the last equality was obtained after adding and subtracting c|A/}|s from the right-hand side of (40), 
and relying on the definitions in (23). Next, starting from (39) and upon: i) using (al) to eliminate 

t+1 t+1 T, t+1 



^(t + !) = £ A* +1 -^(r)hJ(T)s + E A t+1 ^h,- (r) Ci (r) 



R/, 



1 - A 



so 



+ VA t+1 -h J (r) ej (r) 



T=0 



T=0 T=0 

from (39); ii) recognizing y2,j(t + 1) in the right-hand side of (39) and substituting it with (41); and hi) 
replacing the sums of noise vectors with the quantities defined in (21) and (23); one arrives at 



-i 



yi>i (i + l) = (l-A)R 



+ (1 - A)R^ 



- y 

t+1 



(yi > iW-yi,i'(i))-y 2j (t) 



A f+1 ^h J (r)e i (r) + rfi(t) - ijf (t) + 



T=0 



(42) 



What remains to be shown is that after stacking the recursions (42) and (41) for j = 1, . . . , J to form 
the one for y(t + 1), we can obtain the compact representation in (24). Examining (41) and (42), it is 
apparent that a common matrix factor bdiag((l — A)R^" 1 , Ij p ) can be pulled out to symplify the expression 
for y(t + 1). Consider first the forcing terms in (24). Stacking the channel noise terms from (42) and 
(41), readily yields the last three terms inside the curly brackets in (24). Likewise, stacking the terms 
St=o A* +1 ~ T hj (r)ej(r) for j = 1, . . . , J yields the second term due to the observation noise; recall the 
definition of e(t + 1). This term as well as the vectors fjj(t) are not present in (41), which explains the 
zero vector at the lower part of the second and third terms inside the curly brackets of (24). 

To specify the structure of the transition matrix T, note that the first term on the right-hand side of (41) 
explains why [T]22 = Ijp- Similarly, the second term inside the first square brackets in (42) explains why 
[T]i2 = —I j p . Next, it follows readily that upon stacking the terms (c/2) X^/gjy. (yi j (t) — yij'(*))> which 
correspond to a scaled Laplacian-based combination of p x 1 vectors, one obtains [(c/2)L (g) I p ]yi(t) = 
L c yi(*)- This justifies why [T]n = -[T] 2 i = -L c . 

A comment is due regarding the initialization for t = to- Although the vectors {yij(to)}/=i are 
decoupled so that yi(io) can be chosen arbitrarily, this is not the case for {y2j(to)}/=i which are coupled 
and satisfy 

J J 

Ey2j(*) = E E (vf(t-i)-vj,(t-i)) = o p , vt>o. (43) 

j=l j=lj>£N 3 
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The coupling across {y2j(£)}/ = i dictates y2(to) to be chosen in compliance with (43), so that the system 
(24) is equivalent to (38) and (39) for all t > to- Let y2(to) = L c y' 2 (to), where y 2 (to) is anv vector in M Jp . 
Then, it is not difficult to see that y2(£o) satisfies the conservation law (43). In conclusion, for arbitrary 
y'(0) € R 2Jp the recursion (24) should be initialized as y(0) = bdiag(Ij p , L c )y'(0), and the proof of 
Lemma 1 is completed. ■ 



T T 



B. Proof of Lemma 2: Recall the structure of matrix T given in Lemma 1. A vector vj ;-- 
with {yy,j}j =1 € M Jpxl is a left eigenvector of Q associated to the eigenvalue one, if and only if it solves 
the following linear system of equations 

-v^(l-A)R^ 1 L c + vi; i L c = v^ 



-v^(l-A)Rr 1 + v 2 : i = v 



2.i 



The second equation can only be satisfied for vi j = 0j p , and upon substituting this value in the first 



equation one obtains that v 2j j £ nullspace(L c ) = nullspace(L 



I p ) for all values of c > 0. Under the 



assumption of a connected ad hoc WSN, nullspace(L) = span(lj) and hence nullspace(L 



is a 



p-dimensional subspace. 

Following steps similar to those in [27, Appendix H], it is possible to express the eigenvalues of Q 
that are different from one as the roots of a second-order polynomial. Such a polynomial does not have 
an independent term, so that some eigenvalues are zero. With respect to the rest of the eigenvalues, it is 
possible to show that their magnitude is upper bounded by A max (I jp — (1 — ^)R^ 1 L C ). Hence, it is possible 
to select c > such that A max (Ij p — (1 — A)R^ 1 L C ) < 1, or equivalently |1 — (1 — A)A max (R^ 1 L c )| < 1, 
which is the same as condition (25). ■ 
C. Proof of Lemma 3: The goal is to establish the equivalence between the dynamical systems in (24) 
and (27) for all t > to, when the inner state is arbitrarily initialized as z(to) = y'(^o)- We will argue by 
induction. For t = t , it follows from (28) that z(t + 1) = *y'(*o) + \^h\ ° T ] Te (*o + 1), since (by 
convention) there is no communication noise for t < to- Upon substituting z(to + 1) into (27), we find 



y(t + 1) = bdiag(I Jp , L c )*y'(t ) + 



(e(t + l) + T7(t )) + 



*K*o). (44) 



QjpxJp 

Note that: i) bdiag(I Jp , L c )* = Tbdiag(I Jp , L c ); ii) y(t ) = bdiag(I Jp , L c )y'(t ) for the system in 
Lemma 1; and iii) rj a (t) = P a rj(t), while r/«(t) = Pprj(t) [cf. Appendix E]. Thus, the right-hand side of 
(44) is equal to the right-hand side of (24) for t = to- 

Suppose next that (27) and (28) hold true for y(t) and z(t), with t > to- The same will be shown for 
y(t + 1) and z(t + 1). To this end, replace y(t) with the right-hand side of (27) evaluated at time t, into 



September 23, 2011 



DRAFT 



IEEE TRANSACTIONS ON SIGNAL PROCESSING (SUBMITTED) 



23 



(24) to obtain 

y(t + 1) = bdiag(R^ 1 A , I Jp ) { Ybdiag(I Jp , L c )z(t) + T 



R ft,A 



fj(t - 1) + 



Ijp 

Ojpx Jp 



e(t + l) 



+T 



R 



v(t-i) + 


Ijp 


tj(t) + 


Ijp 


»7a(<) " 


Ijp 










-Ijp 




— Ijp 



bdiag(I Jp ,L c ) *z(i) + * 



r; 



-i 



"Ti.A 
Ojpx Jp 



*7(t-l) + * 



c 



ij(t - 1) 



R M 


c(t + l)j + 




v(t) + 


r m(p« - P/j) 




(45) 


Ojpx Jp 




QjpxJp 




P/3-Pa 







where in obtaining the last equality in (45), the following were used: i) bdiag(Ij p , L c )\l/ = Tbdiag(Ij p , L c ) 
; ii) the relationship between r) a (t),r)a(t) and r)(t) given in Appendix E; and hi) the existence of a matrix 
C such that L C C = — P Q . This made possible to extract the common factor bdiag(Ij p , L c ) and deduce 
from (45) that y(t + 1) is given by (27), while z(t + 1) is provided by (28). 

In order to complete the proof, one must show the existence of matrix C. To this end, via a simple 
evaluation one can check that nullspace(L c ) C nullspace(Pj — P„), and since L c is symmetric, one has 
nullspace(L c )_Lrange(L c ). As nullspace(Pj — PQ)_Lrange(P ( g — P a ), it follows that range(P ( g — P a ) C 
range(L c ), which further implies that there exists C such that L C C = — P a . ■ 
D. Derivation of (33): First observe that the noise supervector e(t) obeys the first-order recursion 



e(t) := £ A t - r [hf(r) ei (r) . . . hJ(r) ej (r)] T = Xe(t - 1) + ptf(t) Cl (t) . . . h T j(t)ej(t)f . 



(46) 



T=0 



Because under (a3) the zero-mean {ej(t)}j£j are independent of z(t — 1) [cf. (28)], it follows readily that 
Rze(t) := E[z(t — l)e T (t)] = XE[z(t — l)e T (i — 1)]. Plugging the expression for z(t — 1) and carrying 
out the expectation yields 



E[z(t - l)e T (t - 1)] = VE[z{t - 2)e T (t - 1)] + * 



R 



-l/p 

c 



R M 

QjpxJp 



E[f)(t-3)e T (t-l)] + 



R, 



-i 



QjpxJp 



E[rj(t-3)e T (t-l)} 
E[e(t- l)e T (t- 1)] 



*R ze (t-l) + 



R^ 



Re(t - 1) 



(47) 



OjpxJp 

The second equality follows from the fact that the zero-mean communication noise vectors are independent 
of e(t — 1). Scaling (47) by A yields the desired result. 
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E. Structure of matrices P a , Pp, Tiff, Rrj, ^fj x > ana " ^r] x ' In order to relate the noise supervectors 
r/ a (t) and rjp(t) with rj(t) in (26), introduce two Jp x (%2j=i Wj\)P matrices P a := [pi 



Pj] T and 



\TlT 



P/3 := [pi . . . p'j] T . The (D,=i Wj\)P x P submatrices pj, p'j are given by pj := [(pj,i) T • • • (pj,j) 



T]T 



and p'j := [(p^i) • • • (p'j,j) V > witn Pj,r, Pj',r defined for r = 1, . . . , J as 



3,r ■' 



4 D |JV r |,r(j) 



^VX \JV r \p 



(P',,) T : 



if r = j 
if r ^ j 



Ip if J e A/" r 

if j ^ A/j- I Opx|AA,-|p 

Note that r(j) G {1, . . . , \M r \} denotes the order in which rfj(t) appears in {fj^,(t)}ji e j^ r [cf. (26)]. It is 
straightforward to verify that rj a {t) = P a rj(t) and rjg(t) = Pprj(t). 

Moving on to characterize the structure of and R^, from (21) and recalling that communication 
noise vectors are assumed uncorrected in space [cf. (a3)], it follows that 



R^ = bdiag[ Yl R ^. J "-' Rt W 

j'6A/i\{l} j'£Afj\{J} 

Likewise, it follows from (26) that R^ is a block diagonal matrix with a total of Ylj=i 1-^0 1 diagonal 
blocks of size p x p, namely 



Krj = bdiag [{Rrj^ JfeM, • • • , I 1 **/ . ,}.,'• .V. J • 

Note also that the blocks Hrj = O pxp for all j G J, since a sensor does not communicate with itself. In 
both cases, the block diagonal structure of the covariance matrices is due to the spatial uncorrelatedness 
of the noise vectors. 

What is left to determine is the structure of R^ and P^r) x - From (30) one readily obtains 



R 









T 


R^ 






Ojpx Jp 


Ojpx Jp 



R 



C 



R 



r m(P--P/3) 
c 



(48) 
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R yij (t) = [R y (0]ii,jeM pxp 



R y (t) = 



e M. 2J p x2J p 



R yi (t) = [R y (t)]nGK J PX* 



Fig. 1. The covariance matrix R w (t) and some of its inner submatrices that are relevant to the performance evaluation of the 
D-RLS algorithm. 




Fig. 2. An ad hoc WSN with J — 15 sensors, generated as a realization of the random geometric graph model on the unity 
square, with communication range r = 0.3. 
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w/noise: empirical evolution 
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Fig. 3. Global steady-state performance evaluation. D-RLS is ran with ideal links and when communication noise with variance 
a% = 10 _1 is present. 
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Fig. 4. Local steady-state performance evaluation. D-RLS is compared to the D-LMS algorithm in [16]. 
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